DETAILED ACTION

1. Claims 1-37 have been examined.

Acknowledgment of papers filed: Amendment on 11 March 2008. The papers
filed have been placed on record.

Claim Rejections - 35 USC § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 1-3, 5-9, 11-13, 15, 18-22, 25-29 and 31 are rejected under 35 U.S.C.
103(a) as being unpatentable over Rotenberg et al. (Slipstream Execution Mode for
CMP-Based Multiprocessors), herein referred to as Rotenberg, in view of Jamil (U.S.
Patent Application Publication No. US 2003/0126365 A1).

4. Referring to claim 1, Rotenberg discloses an apparatus comprising: a first
processor (such as Processor 0, see Figure 2) to execute a main thread instruction
stream (Task 0 (R)) that includes a delinquent instruction (any load which misses in the
R-stream); a second processor (such as Processor 1; see Figure 2) to execute a helper
thread instruction stream (A-Stream) that includes a subset of the main thread
instruction stream (see section 3.2, second paragraph), wherein the subset includes the
delinquent instruction (any program will inherently include cache misses); wherein said
first and second processors each include a private data cache (L1 data cache, see
section 2, first paragraph); a shared memory system (unified L2 cache; see section 2,
first paragraph) coupled to said first processor and to said second processor; and logic
to retrieve, responsive to a miss of requested data (any data not in L1 cache) for the
delinquent instruction (instruction referencing data not in cache) in the private cache of
the second processor (lines which are not referenced at all by the A-Stream, see 2nd
last paragraph of section 3.4), the requested data from the shared memory system (see
2nd last paragraph of section 3.4); the logic further to provide requested data to the
first processor (see the first paragraph of section 3 regarding preloading shared data
into the L2 cache).

Rotenberg does not expressly disclose that the logic is further to provide the
requested data to the private data cache of the first processor.

Jamil teaches that the logic is further to provide the requested data to the private
data cache of the first processor (paragraph 4, lines 18-21).

For this modification to be successful, instead of writing data to the shared cache 
to be read by another processor (Rotenberg; 2nd last paragraph of section 3.4), it would 
be written to the private cache of that processor (Jamil, paragraph 4, lines 18-21). In 
this instance, when the scout thread is prefetching data, it would prefetch the data by 
reading the data into its private cache to be used, and then transfer it to the private 
cache of the primary processor.

It would have been obvious for one of ordinary skill in the art at the time of the
invention to have modified the invention of Rotenberg by making the logic further to
provide the requested data to the private data cache of the first processor as taught by
Jamil in order to decrease the access time of data required by the primary processor,
because communication with on-chip caches or caches of the same level is faster than
communicating through the use of an external shared cache (Jamil, paragraph 5).
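
For illustration only, the data flow of the proposed combination can be sketched as a toy model. The sketch assumes a two-core chip whose helper core misses in its private cache, fills from the shared cache, and forwards the fill to the main core's private cache. All class, method, and variable names are hypothetical and do not appear in Rotenberg, Jamil, or the application; the sketch is not a description of either reference's actual hardware.

```python
# Toy model only. Every name here is hypothetical and chosen for
# illustration; none is taken from Rotenberg, Jamil, or the application.

class Core:
    def __init__(self, name):
        self.name = name
        self.l1 = {}  # private data cache: address -> value


class Chip:
    def __init__(self, memory):
        self.shared_l2 = dict(memory)  # shared memory system
        self.log = []                  # record of (core, address, event)

    def load(self, core, addr, push_to=None):
        """Service a load; on a private-cache miss, fill from the shared L2.

        If push_to is given, the fill is also forwarded to that core's
        private cache (the cache-to-cache transfer assumed in the
        combination discussed above).
        """
        if addr in core.l1:
            self.log.append((core.name, addr, "L1 hit"))
            return core.l1[addr]
        value = self.shared_l2[addr]
        core.l1[addr] = value
        self.log.append((core.name, addr, "L1 miss, filled from L2"))
        if push_to is not None:
            push_to.l1[addr] = value
            self.log.append((push_to.name, addr, "fill pushed to L1"))
        return value


def run_demo():
    chip = Chip({0x40: 7})
    main, helper = Core("main"), Core("helper")
    chip.load(helper, 0x40, push_to=main)  # helper runs ahead and misses
    chip.load(main, 0x40)                  # main later finds the line locally
    return chip.log
```

In this model the main core's later load is serviced from its own private cache rather than from the shared cache, which is the access-time benefit relied upon in the motivation above.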

5. Regarding claim 2, Rotenberg also discloses the first processor, second 
processor and logic are included within a chip package (see Figure 2(c)).

6. Regarding claim 3, Rotenberg also discloses the shared memory system 
includes a shared cache (see the second paragraph of the Abstract). 

7. Regarding claim 5, Rotenberg also discloses the shared cache is included within 
a chip package (see the second paragraph of the Abstract). 

8. Regarding claims 6, 20, and 27, Rotenberg does not expressly disclose that the
logic is further to provide the requested data from the shared memory system to the
private data cache of the second processor.

Jamil teaches logic providing requested data from the shared memory system to
the private data cache of the second processor (see paragraph 23, lines 2-7).

It would have been obvious for one of ordinary skill in the art at the time of the
invention to have modified the combined invention of Rotenberg/Jamil (see above
regarding claim 1) by providing requested data from the shared memory system to the
private data cache of the second processor as taught by Jamil in order to decrease
access time for the second processor by pulling data into the private cache.

Regarding claim 7, Rotenberg also discloses said first and second processors 
are included in a plurality of n processors (see Figure 2(c)), each of said plurality of
processors is coupled to the shared memory system (see the first paragraph of section 
2); and each of said n plurality of processors includes a private data cache (L1 cache, 
see the first paragraph of section 2), wherein n>2 (see Figure 2). 

Regarding claim 8, Rotenberg does not expressly disclose that the logic is further 
to provide the requested data from the shared memory system to each of the n private 
data caches. 

Jamil teaches that the logic is further to provide the requested data from the
shared memory system to each of the n private data caches (see paragraph 23,
lines 2-7).

The combination would be successful if, when a cache miss occurs in the
second processor, the requested data were loaded into all private data caches.
When the invention of Rotenberg fetches data into cache, the data is loaded into the
shared cache, which is accessible by all processors. The combination would therefore
have to make the data accessible to all processors and must do that by transferring the
data to each of the private data caches.

It would have been obvious for one of ordinary skill in the art at the time of the
invention to have modified the combined invention of Rotenberg/Jamil (see above
regarding claim 1) by making the logic provide the requested data from the shared
memory system to each of the n private data caches as taught by Jamil in order to
decrease access time for data needed by the processors (see above regarding claim 1).

9. Regarding claim 9, Rotenberg discloses the logic is further to provide the
requested data from the shared memory system to a subset of the n private data
caches, the subset including x (1; first processor; see above regarding claim 1) of the n
(see Figure 2) private data caches, where 0<x<n (0 < 1 < 2).

Note that if the logic provides the requested data to the private cache of the first 
processor, it would have provided the data to x (1) private data cache. Further note that 
the subset can include all processors because The American Heritage College 
Dictionary defines subset as "A set contained within a set". A set can be contained 
within itself. 

10. Claim 11 recites equivalent limitations as set forth in claim 1 and is
therefore rejected using the same grounds as claim 1.

11. Regarding claim 12, Rotenberg also discloses a shared memory system coupled
to said first processor and to said second processor (see the first paragraph of section
2); wherein said logic is further to retrieve the requested data from the shared memory
system if the requested data is not available in the other private data cache (see above
regarding claim 6).

Note that the limitation "wherein said logic is further... other private data cache"
is equivalent to the limitation of claim 6 and is rejected on the same grounds.

Regarding claim 13, Rotenberg does not expressly disclose that the logic is 
included within an interconnect, wherein the interconnect is to provide networking logic 
for communication among the first processor, the second processor, and the shared 
memory system. 

Jamil teaches that the logic is included within an interconnect (refs. 151-156,
130, see fig. 1), wherein the interconnect is to provide networking logic for
communication among the first processor, the second processor (see paragraph 23, 
lines 4-7), and the shared memory system (see paragraph 23, lines 6-7). 

It would have been obvious for one of ordinary skill in the art at the time of the 
invention to have modified the invention of Rotenberg/Jamil by including logic within an 
interconnect, wherein the interconnect is to provide networking logic for communication 
among the first processor, the second processor, and the shared memory system in 
order to maintain cache coherency between caches without routing data off-chip when 
storing data in private caches (see Jamil, abstract). This increases memory throughput 
between processors (see Jamil, paragraph 5).

12. Claim 15 recites equivalent limitations as claim 3 and is rejected under the same
grounds.

13. Claims 18 and 25 recite equivalent limitations as claim 1 and are rejected for the
same reasons.

14. Claims 19 and 26 recite limitations already discussed above regarding claim 1 
and are rejected for the same reasons. 



15. Claims 21 and 28 recite limitations already discussed above regarding claims 7 
and 8 and are rejected for the same reasons. 

16. Regarding claims 22 and 29, Rotenberg also discloses that prefetching further
comprises: retrieving the load data from a private cache of a helper core; and providing 
the load data to the private cache of the main core (see above regarding claim 8; if data 
is sent to all caches, it will be sent to the cache of the main core). 

Claim 31 recites equivalent limitations as stated in claim 12 and is therefore rejected 
using the same grounds.

17. Claims 4, 16, and 32-37 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Rotenberg in view of Jamil and Jeddeloh (U.S. Patent No.
US 6,789,168 B2).

Regarding claims 4 and 16, Rotenberg/Jamil disclose the apparatus of claim 3
and claim 15.

Rotenberg/Jamil do not expressly disclose that the shared memory system
includes a second shared cache.

Jeddeloh teaches that the shared memory system includes a second shared 
cache (col. 3, lines 66-67 & col. 4, lines 1-2). 

The invention of Rotenberg would have been modified by adding an L3 cache,
implemented in the chipset of the computer, in addition to the L2 cache.

It would have been obvious for one of ordinary skill in the art at the time of the
invention to have modified the combined invention of Rotenberg/Jamil by adding a
second shared cache as taught by Jeddeloh, because the use of an L3 cache increases
the overall size of the cache, making memory accesses less frequent and therefore
increasing overall system bandwidth.

18. Regarding claim 32, Rotenberg discloses a system comprising: a memory 
system (see first paragraph of Abstract); a first processor (Processor 0, see figure 2), 
coupled to the memory system, to execute a first instruction stream (R-Stream); a 
second processor (Processor 1, see figure 2), coupled to the memory system, to
concurrently execute a second instruction stream (A-Stream, see above regarding claim
1).

Rotenberg does not expressly disclose helper threading logic to provide fill data
prefetched by the second processor to the first processor.

Jamil teaches helper threading logic (fig. 1, refs. 151-156, 130) to provide fill data
prefetched by the second processor to the first processor (Jamil paragraph 23,
lines 4-5; see above regarding claim 1).

It would have been obvious for one of ordinary skill in the art at the time of the 
invention to have modified the invention of Rotenberg by adding helper threading logic 
to provide fill data prefetched by the second processor to the first processor as taught 
by Jamil in order to decrease cache access time for the main processor (see above 
regarding claim 1). 

Further, Rotenberg/Jamil do not expressly disclose that the memory system
includes a dynamic random access memory.

Jeddeloh teaches a memory system that includes a dynamic random access
memory (see paragraph 1).

It would have been obvious at the time of the invention for one of ordinary skill in 
the art to have modified the combined invention of Rotenberg/Jamil by using a memory 
system that includes a dynamic random access memory as taught by Jeddeloh in order
to decrease the physical size of the cache as compared to SRAM (see Jeddeloh col. 4,
lines 52-54).

19. Regarding claim 33, Rotenberg also discloses the system of claim 32, wherein:
the helper threading logic is further to push the fill data to the first processor before the
fill data is requested by an instruction of the first instruction stream (see above
regarding claim 1).

Note that Rotenberg updates the shared memory (the memory being accessed
by the main processor) as soon as the scout processor receives the data. Using the
cache setup of Jamil, the memory that would be updated would be the private data
cache of the main processor, and the update would occur at the time the data is
reached in the scout thread, ahead of the main thread.

20. Claim 34 recites an equivalent limitation as set forth in claim 22 and is therefore 
rejected using the same grounds. 

21. Regarding claim 35, Rotenberg also discloses the helper threading logic is
further to provide the fill data to the first processor from the memory system (see above 
regarding claim 1). 

Note that the fill data comes from the shared memory indirectly through the 
cache of the second processor. 

22. Claim 36 recites equivalent limitations discussed above regarding claim 13 and is 
rejected for the same reasons. 

23. Claim 37 recites equivalent limitations discussed above regarding claim 3 and is 
rejected for the same reasons.

24. Claims 10 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Rotenberg in view of Jamil and Luk (U.S. Patent Application Publication No. US
2002/0055964 A1).

25. Regarding claim 10, Rotenberg/Jamil do not expressly disclose that the first
processor is further to trigger the second processor's execution of the helper thread
instruction stream responsive to a trigger instruction in the main thread instruction
stream.

Luk teaches the use of a trigger instruction in a main thread to start a
helper thread (paragraphs 8-9).

It would have been obvious to one of ordinary skill in the art at the time of the
invention to modify the combined invention of Rotenberg/Jamil by including an
instruction in the main instruction stream to start execution of the helper thread as
taught by Luk, in order to use hardware to prefetch in situations where prefetching will
help the current thread and to be able to stop the pre-execution thread if it will not help
and another thread needs to use the hardware.

26. Claim 17 recites equivalent limitations as stated in claim 10 and is therefore 
rejected using the same grounds. 

27. Claims 14, 23, 24, and 30 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Rotenberg in view of Jamil and Dhong (U.S. Patent No. 6,138,208).

Regarding claim 14, Rotenberg/Jamil disclose the apparatus of claim 13,
wherein: the first and second processor are each included in a plurality of n processors
(n = 2; only the first and second processors); and the interconnect is further to
broadcast a request for the requested data to each of the n processors and to the
shared memory system (Jamil, paragraph 24, lines 11-18).

Rotenberg/Jamil do not expressly disclose that the requests are done
concurrently.

Dhong teaches a method for concurrently requesting data from two levels of 
cache (col. 4, lines 35-43). 

It would have been obvious for one of ordinary skill in the art at the time of the 
invention to modify the combined invention of Rotenberg/Jamil (see above regarding
claim 1) to concurrently request data in the private data caches of private processors 
(L1 cache) and the shared data cache (L2 cache) as taught by Dhong in order to 
decrease the access time for the higher level of cache by overlapping L1 and L2 cache 
accesses (Dhong, col. 4, lines 40-43). 
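
As a rough illustration of the stated motivation, the following sketch compares the latency of a request that misses in the private (L1) cache when the L2 lookup starts only after the L1 miss is known against the latency when both lookups are issued concurrently. The function name, the simple additive model, and the cycle counts are assumptions made for illustration; they are not figures from Dhong or from any other reference of record.

```python
def miss_latency(l1_cycles, l2_cycles, overlapped):
    """Cycles to obtain data that misses in L1 and hits in L2.

    Assumed model (hypothetical, for illustration only):
      - sequential: the L2 lookup starts after the L1 lookup finishes
      - overlapped: both lookups start in the same cycle
    """
    if overlapped:
        return max(l1_cycles, l2_cycles)
    return l1_cycles + l2_cycles


# Hypothetical cycle counts, not taken from any reference.
sequential = miss_latency(l1_cycles=3, l2_cycles=12, overlapped=False)
concurrent = miss_latency(l1_cycles=3, l2_cycles=12, overlapped=True)
saved = sequential - concurrent
```

Under this assumed model the saving equals the shorter of the two lookup times, which is the sense in which overlapping the accesses decreases the access time for the higher cache level.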

28. Claim 23 recites equivalent limitations as stated in claim 14 and is therefore 
rejected using the same grounds. 

29. Claim 24 recites equivalent limitations as stated in claim 12 and is therefore 
rejected using the same grounds.

30. Claim 30 recites equivalent limitations as set forth in claim 14 and is therefore
rejected using the same grounds.

Response to Amendment 



The declaration filed on 10 October 2006 under 37 CFR 1.131 is sufficient to overcome
the Damron reference.



Response to Arguments 

Applicant's arguments have been considered but are moot in view of the new 
ground(s) of rejection. 

Conclusion 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Jesse R. Moll whose telephone number is
(571)272-2703. The examiner can normally be reached on M-F 10:00 am - 6:30 pm EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Alford Kindred, can be reached on (571)272-4037. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

Jesse R Moll 
Examiner 
Art Unit 2181 

JM 9/4/2007 

/Tonia LM Dollinger/ 

Primary Examiner, Art Unit 2181 



